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We investigated the properties of Boolean networks that follow a given reliable trajectory in state 
space. A reliable trajectory is defined as a sequence of states which is independent of the order in 
which the nodes are updated. We explored numerically the topology the update functions, and the 
state space structure of these networks, which we constructed using a minimum number of links 
and the simplest update functions. We found that the clustering coefficient is larger than in random 
networks, and that the probability distribution of three-node motifs is similar to that found in gene 
regulation networks. Among the update functions, only a subset of all possible functions occur, and 
they can be classified according to their probability. More homogeneous functions occur more often, 
leading to a dominance of canalyzing functions. Finally, we studied the entire state space of the net- 
works. We observed that with increasing systems size, fixed points become more dominant, moving 
the networks close to the frozen phase. 

PACS numbers: 89.75.Da,05.65.+b,91.30.Dk,91.30.Px 
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I. INTRODUCTION 

Boolean networks (BNs) are used to model the dynam- 
ics of a wide variety of complex systems, ranging from 
neural networks Q] and social systems [2J to gene regula- 
tion networks |3J. BNs are composed of nodes with binary 
states, coupled among each other. The state of each node 
evolves according to a function of the states from which it 
receives its inputs, similarly to what is done when using 
cellular automata [4], but in contrast to cellular automata, 
BNs have no regular lattice structure, and not all nodes are 
assigned the same update function. 

The simplest type of BNs are random BNs |5|, where 
the connections and the update functions are assigned at 
random to the nodes. These random models have the ad- 
vantage of being accessible to analytical calculations, thus 
permitting a deep understanding of such systems 1 6 [ . Ran- 
dom BNs can display three types of dynamical behavior, 
none of which is very realistic: in the "frozen" phase, most 
or all nodes become fixed in a state which is independent 
of the initial conditions. In the "chaotic" phase, attractors 
of the dynamics are extremely long, and dynamics is very 
sensitive to perturbations. At the critical point between 
these two phases, attractor numbers are huge and depend 
strongly on the update scheme used (ZHSl. 

In contrast to random BNs, real biological networks typ- 
ically display a highly robust behavior. For instance, the 
main dynamical trajectory of the yeast cell-cycle network 
model derived by Li et al. J9| changes little when the 
nodes are updated in a different order, and the system 
returns quickly to this trajectory after a perturbation. In 
fact, whenever the functioning of a system depends on the 
correct execution of a given sequence of steps, the system 
must be robust with respect to the omnipresent effects of 
noise. 
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Motivated by this requirement, we focus in the present 
paper on the robustness of dynamical trajectories under 
fluctuations in the time at which the nodes are updated. 
We consider the extreme case, where we require the sys- 
tem to have a trajectory that is completely robust under 
a change in the update sequence. This means that at any 
time all but one node would remain in their present state 
when they are updated. 

In contrast to the standard approach to BNs, where first 
the network structure (i.e., the topology and update func- 
tions) is defined and then the dynamics is investigated, 
we define first the dynamical trajectory and then construct 
networks that satisfy this trajectory, with the trajectory be- 
ing robust under changes in the update sequence. A sim- 
ilar method has been used in HTOl . In the next section, we 
will define the model and methods used. Then, we will 
discuss the properties of the networks constructed by this 
methods, considering the topology, the update functions, 
and the state space structure. Finally, we will outline di- 
rections for further investigations. 



II. THE MODEL 

A BN is defined as a directed network of N nodes rep- 
resenting Boolean variables cr e { 1 , 0} N , which are subject 
to a dynamical update rule, 

ffi(t + l) = fi(ff(t))ui(t) + ffi(t)[1 -ut(t)] (1) 

where f t is the update function assigned to node i, which 
depends exclusively on the states of its inputs. The binary 
vector u(t) represents the update schedule, and has com- 
ponents u-i(t) = 1 if node i should update at time t, or 
u\(t) — if it should retain the same value. The update 
functions ft are conveniently indexed by the outputs of 
their truth table as follows: Given an arbitrary input order- 
ing, each input value combination cfj = {o~0/ o~i , . . . , cr 1c _ 1 } 
will have an associated index c(cr) = Y-i ^ which 
uniquely identifies it. Any update function f can in turn 
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be uniquely indexed by f = Y.j f(Cj)2 c '°'j', where f((Tj) is 
the output of the indexed function, given the input value 
combination o~j . 

The update schedule can be chosen in three different 
ways: (a) Synchronous (parallel update), where u(t) = 1, 
and all nodes are updated simultaneously every time step; 
(b) Asynchronous and deterministic, where, for instance, 
u(t) = {1 - 8((t+ t°) mod ti)}, where ti is the period 
with which vertex i is updated, t? is a local phase, and 
0(x) is the Heaviside step function; and finally (c) Asyn- 
chronous and stochastic, where Uj = 1 and = 0; in 
the fully stochastic case j is a random value in the range 
[1, N], chosen independently at each time step. 

The choice of update schedule should take into account 
the fact that processes in biological (cellular) networks are 
subject to stochastic fluctuations which can affect the tim- 
ing of the different steps. In principle, a network could 
be organized such that the time interval between subse- 
quent updates is so large that the update sequence is not 
affected by a small level of noise in the update times. In 
this case, an asynchronous deterministic updating scheme 
would be appropriate. However, more generally, the noise 
will also affect the sequence in which nodes are updated, 
suggesting an updating scheme that contains some degree 
of stochasticity. 

In principle, networks can respond in different ways to 
stochasticity in the update sequence (see Fig. HI: (a) The 
system has no specific sequence of states, andit quickly 
looses memory of its past states, (b) The system has some 
degree of ordering in the sequence of states, with "check- 
point" states that occur in a given order, and with certain 
groups of states occurring in between, (c) The system has 
entirely reliable dynamics, where the sequence of states is 
always the same on the attractor, no matter in which order 
the nodes are updated. 

In this paper, we will focus on systems that have an at- 
tractor that has entirely reliable dynamics. Many cellular 
processes, such as the response to some external signal, 
or the cell cycle, can only function correctly if the system 
goes through the required sequence of states in the cor- 
rect order. Therefore, considering the idealization of fully 
reliable dynamics is biologically motivated. Furthermore, 
studying networks with entirely reliable dynamics is also 
of theoretical interest, since it is an idealized situation on 
which one can build when studying more complicated 
cases. Entirely reliable dynamics can be implemented by 
enforcing that consecutive states of the attractor trajectory 
differ only in the value of one node. In other words, the 
Hamming distance between successor states is always 1. 
It is obvious that this is the only possible type of trajectory 
that can be entirely independent of the update schedule. If 
two subsequent states differed by the state of two or more 
nodes, then it would be possible to devise an update se- 
quence which would update one node but not the other, 
in contradiction to our assumption. 

Entirely reliable attractors are represented in state space 
as simple loops. We denote the number of different states 
on the attractor by L = Y.i U/ where It is the number of 
times node i changed its state during a full period (since 



the trajectory is periodic, l| must be equal to or a multi- 
ple of 2). Furthermore, if the states of the system were rep- 
resented by the corners of a N -dimensional Hamming hy- 
percube, the trajectory should follow its edges (see Fig. 12} . 
The shortest possible trajectory length, considering that no 
node remains at a constant value, is L = 2N, with l| = 2 for 
all nodes. The longest possible trajectory length is L = 2 N , 
where all states of the system are visited, and the trajectory 
corresponds to a Hamiltonian walk on the N -dimensional 
Hamming hypercube [17J. 




(a) Stochastic dynamics 



(b) "Checkpoint" states 




(c) Entirely reliable trajectory 

FIG. 1 : (Color online) Ilustration of levels of dynamical reliability. 
Each node on the graphs above is a state of the system, and the 
edges represent possible transitions between them. 
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FIG. 2: Example of reliable trajectory of length 6 on a system of 
size N = 3. 



III. MINIMAL RELIABLE NETWORKS 

A. Construction rule 

The goal of this section is to construct BNs that have a 
^iven entirely reliable trajectory, and to investigate their 
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properties. A fully reliable trajectory has the property 
that the sequence of states is independent of the updat- 
ing scheme, which means that under parallel update only 
one node at a time changes its state. How networks that go 
through a given sequence of states can be constructed, was 
demonstrated by Lau et al 1 10 1, who investigated all possi- 
ble networks which exhibit the main trajectory of the Yeast 
cell cycle regulatory network. Thus, we first define the 
dynamics, from which we obtain the topology and func- 
tions, opposite to what is usually done in the literature on 
Boolean Networks. 

In fact, there exist many networks that display a given 
trajectory. Even when the full state space structure is spec- 
ified, which defines the successor state of each of its 2 N 
possible states, it is possible to construct a network that 
has this state space structure. This can be done by con- 
structing a fully connected graph with k = N and by as- 
signing to each node the function that has the required 
output for each of the 2 N input states. In the end, in- 
puts that never affect the output can be removed. If there 
are different sets of inputs that can be simultaneously re- 
moved, different networks are obtained. 

When not the entire state space structure, but only one 
reliable trajectory is specified, there exist consequently 
many networks with different topology and functions 
which have this trajectory and may differ in the rest of 
their state space. We will restrict ourselves to minimal net- 
works, i.e., networks with the smallest possible number 
of inputs for each node and the simplest possible func- 
tions, which have the maximum possible number of iden- 
tical entries in the truth table. This minimality condition is 
motivated by the putative cost associated with more con- 
nections or more complicated functions, which would de- 
crease the fitness of an organism. This is in contrast to 
what was done in |10|, where all possible networks were 
considered, which is only feasible on very small systems. 

Such minimal networks can be constructed by a 
straightforward algorithm, because the inputs and the 
function required for each node can be determined in- 
dependently from those of all the other nodes. The in- 
puts for a given node must include all predecessor nodes, 
which change their state 1 time step before the given node 
changes its state. Additional inputs are required if the 
given node assumes, during the course of the trajectory, 
different binary states for the same configuration of the 
predecessor nodes. The choice of these "excess" inputs is 
usually not unique and may include self inputs. We per- 
form this choice at random, but only from the possibilities 
which minimize the number of inputs to each node. If not 
all possible configurations of the states of the input nodes 
occur during the course of the trajectory, the update func- 
tion of the given node is not unique. We first assign those 
truth table entries of the update function that are speci- 
fied by the trajectory. Then, we assign to all remaining 
entries the same output value, and we choose the majority 
of output values assigned so far. (If there is no majority, 
we choose either value with probability 1/2.) 

The algorithm used for choosing the minimal set of in- 
puts proceeds as follows: To each node, we first assign all 



predecessor nodes as inputs. Then, if needed, we choose 
"excess" inputs. We first set the number of excess inputs 
to k' = 1, and we test in a random order the (w) pos- 
sible node combinations until we find a node set which, 
together with the predecessors, is a valid input set. If no 
valid combination is found, we increase k' by 1 and repeat 
the search. Once a valid combination is found, the corre- 
sponding truth table is completed by applying the mini- 
mality condition to its unspecified entries. The run time 
of this algorithm increases as 0(lN max ( max f k '' 1 ' ), where 
I is the average number of flips per node, and max(k') is 
the maximum value of k' for all nodes. We have observed 
that the run times are feasible for networks of size up to 
N = 400 and I = 12. Qgl 

An example for a reliable trajectory and two possible 
networks with their functions, obtained with the above al- 
gorithm, is given in Figure [3] 




FIG. 3: (Color online) Example of a random reliable trajectory 
for N = 1 and 1 = 4, and two possible minimal networks. The 
edges with dashed (red) lines represent the inputs that are dif- 
ferent between the two networks. Below each network are the 
outputs of the truth table of each node, ordered from top to bot- 
tom, and left to right, according to their input combination in- 
dices. Outputs marked in grey (cyan) correspond to input com- 
binations present in the given trajectory. 

We choose the reliable trajectory at random, without 
taking into consideration possible particular features of 
biological networks, such as different temporal activation 
patterns of the different nodes, which reflect the function 
that the network must fulfill. Instead, we will consider 
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a null model, where the values of the nodes change ran- 
domly. The only restriction which is imposed is that the 
trajectory is reliable. The only two parameters of this tra- 
jectory are the number of nodes N, and the average num- 
ber of flips per node I. We generate a random ensemble 
of reliable trajectories in the following way: First, we de- 
termine how often each node shall be flipped. To this pur- 
pose, for each node i a random number A| is chosen from 
a Poisson distribution with mean (I — 2)/2, implying that 
node i shall be flipped 1^ = 2A| + 2 times. The average 
number of flips of each node is thus identical to I, and each 
node is flipped at least twice. The length of the trajectory 
is then L = 2N + 2 = k- Then, we arrange these 

flips in a randomly chosen order. If the resulting trajectory 
contains the same network state twice, it is discarded, and 
a new sequence of flips is chosen. 




B. Topological characteristics 

We first present results for the topological characteristics 
of the obtained networks. We evaluate the degree distri- 
bution and the local correlations. The degree distribution 
is of course strongly dependent on I. Local correlations 
can arise when two nodes that are influenced by the same 
nodes are more likely to influence each other. 

Unless stated otherwise, we averaged the results from 
several independent realizations of the minimal trajecto- 
ries and minimal networks, for different N and I. The 
number of realizations for small N, up to 20, were at least 
2000. For intermediary values of N, up to 100, it varied 
from 50 to 300, depending on I. For the larger networks, 
N > 1 00, it ranged from 200 to 6 networks for I < 1 2, and 
one realization for N = 400 and 1=12. 



1. Degree distribution 

The number of inputs of a node is at least as large as 
the number of its predecessors. Whenever the state of the 
node cannot be written as a function of the predecessors 
alone, "excess" inputs must be chosen, as already men- 
tioned before. The number of different predecessors n p 
per node approaches, for large N, on average I, since it be- 
comes unlikely for large N that the same node is chosen 
twice as predecessor. The typical truth table size grows 
therefore with I as 2 l . Since the number of different pre- 
decessor states grows only quadratically as n p l ~ I 2 , one 
can expect the number of "excess" inputs to be small, and 
number of inputs per node should be 

(k) ~ I, (2) 

for sufficiently large N . This is confirmed by our numeri- 
cal investigations, as is shown in Fig. E] 



FIG. 4: (Color online) Average degree (k) as function of I for net- 
works of different size N. The straight line is the function (k) = I. 



The degree distribution mirrors the distribution of the 
number of predecessors. Since all nodes flip on average 
the same number of times, the distribution is expected to 
follow a Poisson distribution for large enough I. This is 
indeed the case, as Fig.|5]shows. For small I however, the 
distributions are more narrow, because we imposed the 
condition that each node flips at least twice, leaving little 
freedom for additional predecessors when I is close to 2. 
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FIG. 5: (Color online) In-degree and out-degree distributions of 
minimal networks for different values of I, for N = 1 00. The solid 
lines correspond to Poisson distributions with the same average. 
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2. Local correlations 

We obtained information about the local topology of 
the minimal networks by evaluating the probability that 
the neighbours of a given node are connected to each 
other. This probability is the so-called clustering coefficient 
(c) Random uncorrelated networks show absence of 
clustering only in the limit N — * oo. Thus, for finite N, 
it is necessary to compare the obtained value with a ran- 
dom network of equal size and with equal degree distri- 
bution. In order to do this, we calculated the clustering 
coefficient (c s ) on shuffled networks, where the links were 
rewired randomly, preserving the in- and out-degree of 
each node. We then calculated the ratio (c)/(c s ), for net- 
works of different size and average flip number I. If the 
ratio approaches 1 , the network does not exhibit any spe- 
cial clustering. The results for several values of N and I 
are shown in Fig. [6] 




FIG. 6: (Color online) Clustering ratio (c)/{c s ) as a function of 
the average number of flips per node I, for different network 
sizes N . The gray straight line corresponds to a decay of the type 
1/N. 

The most evident feature of Fig. [6] is that clustering is 
stronger for smaller 1, i.e., for sparse networks. For larger I 
(and hence larger (k) ), the average distance between nodes 
decreases, and the shuffled and original networks have a 
similar degree of clustering. This difference between net- 
works with smaller and larger average degree becomes 
more pronounced when the size of the networks N is in- 
creased. From the data in Fig. [6| it appears that that the 
ratio (c)/(c s ) increases slowly with N. We will argue in 
the following that this ratio will reach a finite asymptotic 
value in the limit N — > oo. 

The finding that the clustering coefficients are larger 
than for random networks can be explained by consider- 
ing the above-mentioned excess inputs that are required 
when the function assigned to a node cannot be based on 
its predecessor nodes alone. Let us consider two consec- 
utive flips of a node j on a given trajectory. These flips 
are preceded by flips of the predecessor nodes, which we 
call v and w. The average time between the two consid- 



ered flips of node ) is ~ L/l = N, implying that there is a 
considerable probability that node v flips again before the 
second flip of node ), giving the sequence 



V) 



■ v ■ ■ Wf . 



The update function assigned to node j needs an excess 
input if neither node w nor any other predecessor of node 
j (which can exist only for I > 2) flips between the first 
flip of j and the second flip of v. The simplest choice of 
this excess input is node ) itself. Indeed, self-inputs occur 
more often than in the shuffled networks, as is shown in 
Fig. [7] Since the number of different possible excess in- 
puts is proportional to N, we expect that the fraction 
of nodes with self -inputs decreases as ri\ ~~ 1 /N for large 
N, but remains larger than that of shuffled networks by a 
constant multiplicative factor. 
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FIG. 7: (Color online) Fraction of nodes with self-input, as a 
function of N, for different values of I. The dashed curves are 
obtained for shuffled networks, with the same degree sequence. 
The inset shows the ratio n^/nf, where n[ is the self-input ratio 
for the shuffled networks. 

The excess input cannot be a self-input if node w flips 
also in the same interval, giving the sequence 



V) 



■ W • W) . 



In this case, an excess input u must be chosen among those 
nodes that flip between the two consecutive flips of node 
w, if none of the other predecessors of ) flips in this inter- 
val, giving the sequence 

vj ■ ■ • V • • ■ w • • ■ u ■ ■ • wj . 

Now, the average distance between the flips of node w and 
node u is smaller than that between two randomly chosen 
nodes, since W is required to flip in the indicated interval. 
Therefore, the probability that w is an input to u or vice 
versa is larger than random, and it scales as 1/N in the 
limit N — * oo. Since w and u are inputs to j, it follows that 
the clustering coefficient is larger than the random value 

From this consideration, it follows that the ratio (c)/(c s ) 
approaches a constant value in the limit N — > oo. Further- 
more, it follows that this ratio is larger for smaller I, since 
it is less likely that there exist additional inputs to j that 
flip in the required interval and make excess inputs un- 
necessary. The slight increase seen in Fig. [6] can probably 
be attributed to a finite-size effect. 
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FIG. 8: (Color online) The z-score of the different three-node 
subgraphs of minimal reliable networks, for different values of 
1 and N = 100. The profile z s t corresponds to the the signal- 
transduction interaction network in mammalian cells 1141. 



In order to determine which three-node subgraphs con- 
tribute to the increased clustering, we evaluated their z- 
score, which indicates to what extent the frequency of each 
subgraph is different compared to the random case. The z- 
score is defined as 



(Ni)-(N X S ) 
>/((N() 2 >-<Nf> 2 ' 



(3) 



where N | is the number of occurrences of subgraph i, and 
N | is the number of occurrences of the same subgraph on 
a shuffled network with the same degree sequence. Fig. [8] 
shows the different possible subgraphs and their z-score. 
Subgraphs with more links have a higher z-score and are 
therefore network motifs. Sparser subgraphs, where there 
is no link between two of the nodes, are rarer than at 
random, as predicted by the clustering coefficient. The 
abundance of denser motifs increases with I, as the net- 
work itself becomes more dense, but the overall trend of 
the z-score is the same. One peculiar feature is the ab- 
sence of simple loops (subgraph 6), also know as feed- 
back loops HT2H . As was described above, the clustering 
is mostly due to the correlations between the inputs of a 
given node. A simple loop does not have this type of cor- 
relation. Furthermore, it was shown by Klemm et al 1T3| 
in a study of the reliability of small Boolean networks, that 
feedback loops are harmful to reliable dynamics. These 
authors obtained a z-score profile very similar to Fig.|8](see 
Fig. 4 of 1 13 1). They also showed that this profile is quali- 
tatively similar to real biological networks studied in [14 1. 
A direct comparison is shown in Fig.|8j with the motif pro- 
file of the signal-transduction interaction network in mam- 
malian cells ITU . 
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(c) k = 3, without self-loops. 
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(d) k = 4, without self-loops. 

FIG. 9: (Color online) Distribution of the different update func- 
tions, for different numbers of inputs, k, and for different flip 
numbers, I, for networks of size N = 20. 
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In Fig |8j we did not keep track of the self -inputs, for 
simplicity. When self -loops are included in the subgraphs, 
their number increases from 13 to 86, which makes the 
analysis and presentation more elaborate. We performed 
this analysis and found that a subgraph with a specific 
number of self -loops has a larger z-score than its counter- 
part with less or no self -loops. The z-score pattern of Fig.|HJ 
on the other hand, is repeated for subgraphs which share 
the same number of self -loops, which shows that motif oc- 
currence and self -regulation are largely independent. 



C. Properties of update functions 

We evaluated the frequency of the different types of up- 
date functions in minimal networks, for different values 
of I, see Fig. [9] Unless otherwise stated, the results were 
obtained from 1 4 independent realizations of networks 
with N = 20. We compared the results with those obtained 
for larger values of N, with no discernible difference other 
than the reduced statistical quality. Functions with differ- 
ent numbers of inputs were evaluated separately. 

The functions seem to be distributed according to differ- 
ent classes, where functions of the same type occur with 
the same probability, while some do not occur at all. In 
order to understand this distribution, it is necessary to de- 
scribe in detail what conditions need to be met by the func- 
tions, according to the imposed dynamics and construc- 
tion rules. 

The subsystem composed only of the inputs of a given 
node follows a certain "local trajectory" (i.e., sequence of 
states), which determines, together the minimality condi- 
the update function of the consid- 



tion described in Sec. Ill 
ered node. The probabilities of the different possible tra- 
jectories depend on the way the global trajectory is speci- 
fied, and on the rules for choosing excess inputs. The re- 
strictions imposed on the local trajectories of the inputs are 
as follows: 

1. The local trajectory of the inputs must correspond to 
a periodic walk on the k-dimensional hypercube rep- 
resenting their states, since the Hamming distance at 
each step must be 1 . We note that in this subsystem, 
the same input state is allowed to repeat within a pe- 
riod (only the global state cannot). The vertices of the 
hypercube can be annotated with the output value 
of the function at the corresponding input state (see 
Fig. 10 for examples). 



2. For large N, the trajectories of any two different 
nodes will be approximately random and uncorre- 
lated. The only restriction is that every face of the 
hypercube will be visited exactly l v times, where v is 
the index of the input node that has a fixed state on 
this face. On average we have (l v ) = I. 

3. The output values of the function can be distributed 
on the vertices of the hypercube that are visited dur- 
ing the walk in any possible way, with the restriction 
that the output value must change lj times along the 



walk, where j is the index of the considered node. 
An exception are functions with self -inputs: the ver- 
tices on the hypercube face corresponding to the self- 
input must all have the same output value. 

4. The output values at the vertices of the hypercube 
which are not visited by the walk must be equal to 
the majority of the output values on the walk (this is 
the minimality condition defined in sectionpTl). 



5. Functions that can be reduced to a function with 
smaller k cannot occur due to the minimality con- 
dition, and the corresponding trajectory can be con- 
fined to a hypercube of smaller dimension. 



Fig. 10 shows examples of trajectories that are allowed or 
not allowed for the case k = 3. 
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trajectory (restriction 2) (restriction 4) (restriction 5) 

FIG. 10: Example of input and output trajectories on the k- 
hypercube representing the states of the inputs, for functions 
with k = 3. Allowed transitions are represented by arrows. The 
color on each vertex represent the ou tput value. Fig. (a) repre- 



sents one type of valid trajectory Figs, (b) to (d) represent invalid 



trajectories according to the indicated restriction: (b) not all sides 
of the cube are visited; [(c)] the function is not minimal; |(d)| the 
function can be reduced to k = 2. 



The listed restrictions result in the observed distribution 
of update functions. We will describe in detail all the pos- 
sibilities for k = 2, and discuss in a more general and ap- 
proximate manner the functions with k > 2. 



1. Functions with k = 2 

Fig.[9]shows that only 8 of the 16 possible functions oc- 
cur, and all of them with equal probability. They are all 
canaly zing functions, with three entries 1 (or 0) in the truth 
table, and one entry (or 1). The hypercube representa- 
tion of all functions is shown in Fig. 11 The functions that 
are not possible are obviously the constant functions (first 
row of Fig. 
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from left to right), and the functions which 
are insensitive to one of their inputs, due to restriction 4 
(second and third row). The other functions which do not 
occur are the reversible functions, which change the out- 
put at every change of an input (fourth row). Those func- 
tions, however, are not entirely impossible: It is possible 
to construct a trajectory that meets all the listed require- 
ments, with the specification that the output flips as often 
as all inputs together (restrictions 2 and 3). Such trajecto- 
ries follow the pattern 

vj • • • wj • • • vj • • • wj, 
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where v and w are the inputs of j . This pattern is impossi- 
ble for 1 — 2, but can occur for larger I, albeit with a small 
probability, since k = 2 functions are less likely for larger 
I; furthermore, the probability that a node has two prede- 
cessors which occur twice decreases with N as - 1 /N 2 . 



o-o • - o •-• o - • 

o-o • - o o-o • - o 

• -• o - • o-o • - o 

• -• o - • •-• o - • 



o-o • - o o - • o-o 

•-o o-o o-o o - • 

•-• o - • • - o •-• 

o - • •-• •-• • - o 



FIG. 11: Representation of all 16 functions with k = 2 on the 
2-hypercube. On the left are the functions which do not (or 
rarely) occur in the minimal networks, and on the right are the 
canalysing functions which occur with equal probability. 



2. Functions with k > 2 

Functions with k > 2 seem to fall into different classes, 
which occur with different probabilities. This can be seen 
by plotting the distribution of the probabilities Pf of the 
different functions, as shown in Fig. 1^ a) for k = 5. The 
different classes seem to correspond to different function 
homogeneity values, defined as the number of minority 
output values in the truth table, d. This can be verified 
by selecting only those functions with a given value of d, 
and plo tting their distribution of probabilities, as shown 
in Figs 13 c) to [(f)] The most frequent class comprises 
the functions witn only one entry in the truth table de- 
viating from the others (f = 2 1 and f = 2 k — 2 X ), with 
d = 1 (see Fig l^[c)| . Those are canalysing functions, 
where all inputs are canalysing inputs. Functions with the 
same homogeneity fall into subclasses which have differ- 
ent probabilities. Those functions are often negated func- 
tions (f ' = 2 k - f) of one another, and this is due to the 
existence of self-inputs: Self -regulated functions are not 
equivalent functionally when they are negated (the input 
corresponding its own output must be negated as well), 
despite sharing the same homogeneity. The <-> 1 sym- 
metry, however, is always preserved. When self-loops 
are ignored, the distribution becomes symmetric with re- 
spect to negation of the output (see Fig ^ £)}, and the ho- 
mogeneity classification becomes the predominant crite- 



a) 



rion to distinguish between the classes (compare Figs 1 
and Kb)) . But even in the absence of self -loops, the pro 
bility classes are not uniquely defined by the homogene- 
ity, and there are overlaps between the different classes, 
as Figs 12|^d) to (f) show. Nevertheless, there is a general 
tendency that functions with larger d are less likely. 

Fig.[l3]shows the probability of finding a function with a 
given value of d. Since the number of different functions in 
a given class increases rapidly with d for small d, the max- 
imum of this distribution is shifted to values of d larger 
than 1 . If this distribution is corrected by the number N <j 
of different functions found with the same value of d, an 




10 

Pf 

(a) All functions 




10 

Pf 

(b) No self-loops 




10 
Pf 

4, no self-loops 



FIG. 12: (Color online) Distribution of function weights Pf , sub- 
divided according to the value of the truth table homogeneity d, 
for different values of the average flip number I, for fixed values 
k = 5 and N = 20. 



overall decreasing function of d is obtained, as shown in 
the graphs in the left column of Fig. [13). 

The observed difference in probability due to different 
homogeneity can be explained as follows. We consider a 
node with k inputs. We denote by M = Y.i m i <= IV2, 1- 
12 the total time during which the node is in the state that 
it assumes less often. The sum is taken over all intervals 
during which the node has this state. 

If we denote the different possible (combined) states of 
the input nodes by letters, we can represent the sequence 
of states through which the considered node and its input 
states go by the following picture: 




m.2 



The shaded areas correspond to the output value 1. A state 
of the input nodes that appears inside the shaded (clear) 
area, must appear again inside the shaded (clear) area each 
time it is repeated. If we consider only the above scenario, 
and essentially ignore that the trajectories must follow the 
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FIG. 13: (Color online) Distribution of functions with different 
values of the truth table homogeneity d, for different average flip 
number 1, and N = 20. 



edges of a k-hypercube, we can show that functions with 
smaller values of d should occur more often. 

Our approximations rely on the fact that, for N — > oo 
and I > 1 (and hence L — > oo), the shaded areas will be 
more numerous and will be further apart in time and less 
correlated. In this limit, the input state number i occurs, 
say, rii times. The probability that each of the input states 
occurs only in one type of area is given approximately by 



n 



Hi f J_ 

+ 



M 



L 



(4) 



The maximum of this function is attained at M = 1/2 (or 
M = L — 1/2, which is excluded since we chose M. such 
that it counts the minority part), which is the minimal pos- 
sible value. The value of d is bounded by M, but can be 
smaller since the same input state can repeat. We can in 
fact see that the case where the same state repeats at all 
M times is more probable, by considering all the possible 
permutations of the state sequence, for a given value of d, 



n 



M 



TV; 



n 

i>d 



L-M 



d<j<i 



(5) 



and observing that it has a maximum at d = 1, since 
M <C L. (This means that there are M shaded areas of 
size 1 each.) It follows that with increasing 1 the weight 



of update functions with d = 1 will become much larger 
than that of every other update function, as is evident from 
Figs. [9] and 13 The dominance of d = 1 functions can al- 
readybe seen for small values of I, although it is less pro- 
nounced. 



D. State space structure 

Finally, we investigated the state space of the con- 
structed networks. We considered the system under a 
stochastic update scheme, since this scheme underlies the 
study presented in this paper. In this case, we define an 
attractor as a recurrent set of states in state space, with the 
property that there are no transitions that escape this set 
(i.e. a strongly connected component in the state space 
graph that has no outgoing connections). The number of 
states in this set is called the size of the attractor. 

We evaluated the probabilities of attractors of a given 
size on the ensemble of minimal networks, and their aver- 
age basin size. For small networks (up to N = 1 2) the at- 
tractor size probability was obtained by exact enumeration 
of the state space. For larger N, the state space was sam- 
pled, taking care that the same attractor was not counted 
twice. This method, however, leads to a bias, since attrac- 
tors with smaller basins are less likely to be counted, and 
the extent of this bias depends on the size of the network. 
Nevertheless, this bias is not relevant for our point of in- 
terest, which is on the occurrence of various attractors, but 
not on their precise statistics. Fig. [14] shows that there ex- 
ist almost always fixed points, andthat there are often at- 
tractors which are much larger than the imposed reliable 
trajectory (we considered attractors of up to n a = 10 5 
states). Note that the probabilities in Fig. [l4]do not sum 
to 1 , since a given network may have many attractors of 
different sizes. 

The basin of attraction was measured as the probabil- 
ity of reaching an attractor, starting from a random con- 
figuration. Fig. 15 shows that the omnipresent fixed point 
has a large basin of attraction. Larger attractors occur with 
smaller probabilities. The weight of the fixed point, com- 
pared to the weight of the imposed reliable trajectory, in- 
creases with increasing N . This can be explained by the en- 
tries in the truth table which are not uniquely determined 
by the reliable trajectory: While number of entries fixed 
throughout the trajectory grow linearly with I, the number 
of remaining entries (as well as their contribution to the 
state space) grow exponentially. In this increasingly large 
region of the state space, the functions behave as constant 
functions. 

Attractors which are larger than the given trajectories 
are due to a portion of network begin frozen in the value 
they have at the fixed point, while other nodes remain 
frustrated, and their states change stochastically, visiting 
a larger portion of the state space, without entering the 
fixed point or the reliable trajectory. 

For comparison, we briefly looked at the attractor sizes 
obtained using a synchronous updating scheme. Not sur- 
prisingly, the attractors become much shorter in this case, 
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FIG. 14: (Color online) Probability of attractor sizes n a , for I = 
2 and 1 = 7. Attractors corresponding to the given trajectories 
are plotted separately with symbols. For each value of N and 
I, 1 4 different networks were analysed. In the case of attractor 
sampling, 1 00 different random initial conditions per network 
were used. 




FIG. 15: (Color online) Average attractor probabilities (basin 
size), (p Q ), for 1 = 2 and 1 = 7. Attractors corresponding to the 
given trajectories are plotted separately with symbols. For each 
value of N and I, 1 4 different networks were analysed, and 1 00 
different random initial conditions per network were used. 



with attractors larger than the given trajectory having only 
a small probability (not shown). 



IV. CONCLUSION 

We have constructed minimal Boolean networks which 
follow a given reliable trajectory in state space. The tra- 
jectories considered have the necessary feature that only 
one node can change its value at any moment in time, 



which guarantees that the sequence of states is indepen- 
dent of the order in which nodes are updated. Otherwise 
the nodes change their states at randomly assigned times 
in the given trajectory, thus constituting a null model for 
reliable dynamics. The minimality condition imposed on 
the networks was that it contains the smallest possible set 
of inputs for each node that allows for the given trajectory. 
Additionally, the truth table entries that are not fixed by 
the trajectory were set to the majority value imposed by 
the trajectory. We then investigated the topology, the up- 
date functions, and the state space of those networks. 

The network structure, as manifest in the degree dis- 
tribution, does not deviate significantly from a random 
topology. However, the network exhibits larger cluster- 
ing than a random network, and exhibits a characteristic 
motif profile, which resembles both real networks of gene 
regulation and the pattern of dynamically reliable motifs 
found in |13|. The existence of clustering and motifs was 
explained by considering the "excess" inputs that are re- 
quired to avoid contradictions in the truth table, and how 
they must be correlated among each other. 

The update functions of the nodes show a characteristic 
distribution, where only a subset of the possible functions 
occur, and these are divided into distinct classes, which oc- 
cur with different probabilities. The main factor discern- 
ing the different classes is their homogeneity, character- 
ized by the number of entries of the minority bit in the 
truth table. Function with homogeneity 1 occur with in- 
creased probability, and become the dominant functions in 
the limit of large trajectories, I — > oo, for fixed k. Functions 
with more minority entries occur with a smaller probabil- 
ity, and this probability decreases as the number of minor- 
ity entries increase. We presented an analytical justifica- 
tion for this finding, considering how the local trajectory 
of the input states of a given function must behave, in the 
limit 1 > 1 . 

Finally we investigated the state space of the con- 
structed networks, considering the possible attractors it 
can have, in addition to the given reliable trajectory. To 
this aim, we used a stochastic update scheme. We ob- 
served that the network almost always exhibits a fix point 
of the dynamics, and often attractors which can be much 
larger than the given trajectory. The basin size of the fixed 
point is very large, and dominates the basin size of the 
given trajectory in the limit of large system size. This is a 
consequence of the minimality condition imposed on the 
network: The region of state space dictated by the im- 
posed trajectory increases only linearly with system size, 
while the entire state space grows exponentially. Outside 
the state space region fixed by the reliable trajectory, the 
constructed functions behave as constant functions, which 
drive the system nearer to the frozen phase. 

In this work, we have used a null model for reliable tra- 
jectories, where the nodes change their values at random 
times. Real gene regulatory networks deviate significantly 
from this, since they must agree with the cell cycle or the 
pathway taken during embryonic development. Certain 
proteins need to be always present in the cell, while others 
are produced only under specific conditions. The degree 
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distribution and the update functions must reflect this be- 
havior. However, some of the features found for the null 
model presented in this work, should also be present in 
more realistic systems. The existence of clustering, and the 
motif profile found, for instance, do not depend strongly 
on the specific temporal patterns of the nodes, but are im- 
posed by the reliability condition. Similarly, the domi- 
nance of strongly canalyzing functions is a consequence 
of the reliability condition and should be relatively robust 
to the introduction of temporal correlations. Neverthe- 
less, biochemistry makes some canalyzing functions more 
likely than others. 

An important feature of biological networks that is not 
reflected in the null model presented in this paper, is the 
robustness with respect to perturbations of a node. Such a 
robustness can only be obtained when more than the min- 
imum possible number of inputs is assigned to a node. 
Indeed, it has been shown in [15J that more redundancy 
allows for more robustness. Similarly, requiring that the 
reliable trajectory has the largest basin of attraction or that 
other attractors of the system are also reliable trajectories, 



may increase the number of links in the network. 

Finally, the requirement that trajectories are fully reli- 
able is an idealization which goes beyond what is neces- 
sary for gene regulatory networks. Real networks have 
checkpoint states, but between these states, the precise se- 
quence of events is not always important. On the other 
hand, full reliability may be necessary for certain subsys- 
tems of the gene network, where a strict sequence of lo- 
cal states is required. The minimal reliable networks dis- 
cussed in this paper should be compared more realistically 
to such reliable modules. 
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